MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning

Authors

  • Yang Liu
  • Jie Yang
  • Yuan Huang
  • Lixiong Xu
  • Siguang Li
  • Man Qi
Abstract

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation, especially when the data set is large. Big data has recently gained momentum in both industry and academia. To realize the potential of ANNs for big data applications, the computation process must be sped up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model for data-intensive applications. The parallelization considers three data-intensive scenarios, defined by the volume of classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated on an experimental MapReduce computer cluster in terms of classification accuracy and computational efficiency.
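
The first scenario, a large volume of data to classify, can be illustrated with a small example. The following is a minimal Python sketch and not the authors' implementation: it assumes a hypothetical, already-trained single-layer network (`WEIGHTS` and `BIAS` are made-up values), and the helper names `split`, `map_chunk`, `reduce_outputs`, and `classify` are introduced here for illustration only. The map and reduce phases run sequentially in one process, whereas a real MapReduce cluster would run the mappers in parallel on separate input splits.

```python
from functools import reduce
from math import exp

# Hypothetical, already-trained single-layer network.
# These values are invented for illustration only.
WEIGHTS = [0.8, -0.5]
BIAS = 0.1


def predict(features):
    """Return class 1 if the sigmoid output exceeds 0.5, else class 0."""
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1 if 1.0 / (1.0 + exp(-z)) > 0.5 else 0


def split(records, n_chunks):
    """Partition the records into n_chunks slices (the input splits)."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map phase: classify every (record_id, features) pair in one split."""
    return [(record_id, predict(features)) for record_id, features in chunk]


def reduce_outputs(left, right):
    """Reduce phase: merge two dictionaries of record_id -> label."""
    merged = dict(left)
    merged.update(right)
    return merged


def classify(records, n_chunks=2):
    """Run the map phase on each split, then reduce to a single result."""
    mapped = [dict(map_chunk(chunk)) for chunk in split(records, n_chunks)]
    return reduce(reduce_outputs, mapped, {})


if __name__ == "__main__":
    records = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [2.0, 1.0]), (3, [-1.0, 1.0])]
    print(classify(records))
```

Because every mapper applies the same fixed network to its own split, the result does not depend on how many splits are used; only the wall-clock time would change on a real cluster.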

Similar resources

MapReduce-based Parallel Learning for Large-scale Remote Sensing Images

Machine learning applied to large-scale remote sensing images shows inadequacies in computational capability and storage space. To solve this problem, we propose a cloud computing-based scheme for learning remote sensing images in a parallel manner: (1) a hull vector-based hybrid parallel support vector machine model (HHB-PSVM) is proposed. It can substantially improve the efficiency of trainin...

Forward kinematic analysis of planar parallel robots using a neural network-based approach optimized by machine learning

The forward kinematic problem of parallel robots is always considered as a challenge in the field of parallel robots due to the obtained nonlinear system of equations. In this paper, the forward kinematic problem of planar parallel robots in their workspace is investigated using a neural network based approach. In order to increase the accuracy of this method, the workspace of the parallel robo...

A Grid Based System for Data Mining Using MapReduce

In this paper, we discuss a Grid data mining system based on the MapReduce paradigm of computing. The MapReduce paradigm emphasizes system automation of fault tolerance and redundancy, while keeping the programming model for the user very simple. MapReduce is built closely on top of a distributed file system, which allows efficient distributed storage of large data sets and allows computation t...

Large-scale Artificial Neural Network: MapReduce-based Deep Learning

Faced with a continuously increasing scale of data, the original back-propagation neural network-based machine learning algorithm presents two non-trivial challenges: the huge amount of data makes it difficult to maintain both efficiency and accuracy, and redundant data aggravates the system workload. This project focuses mainly on solving these issues, combining deep learning algorithm with c...

Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML

SystemML aims at declarative, large-scale machine learning (ML) on top of MapReduce, where high-level ML scripts with R-like syntax are compiled to programs of MR jobs. The declarative specification of ML algorithms enables, in contrast to existing large-scale machine learning libraries, automatic optimization. SystemML's primary focus is on data parallelism but many ML algorithms inherently exh...

Journal title:

Volume 2015, Issue

Pages -

Publication date: 2015